<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="utf-8" />
    
  <meta name="description" content="A blog driven by interest" />
  
  <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1" />
  <title>
     Sapphire
  </title>
  <meta name="generator" content="hexo-theme-yilia-plus">
  
  <link rel="shortcut icon" href="/favicon.ico" />
  
  
<link rel="stylesheet" href="/css/style.css">

  
<script src="/js/pace.min.js"></script>


  

  

<link rel="alternate" href="/atom.xml" title="Sapphire" type="application/atom+xml">
</head>


<body>
  <div id="app">
    <main class="content">
      
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/font-awesome/css/font-awesome.min.css">
<script src="/live2d-widget/autoload.js"></script>
<section class="cover">
    
      
    
  <div class="cover-frame">
    <div class="bg-box">
      <img src="/images/cover1.jpg" alt="image frame" />
    </div>
    <div class="cover-inner text-center text-white">
      <h1><a href="/">Sapphire</a></h1>
      <div id="subtitle-box">
        
        <span id="subtitle"></span>
        
      </div>
      <div>
        
      </div>
    </div>
  </div>
  <div class="cover-learn-more">
    <a href="javascript:void(0)" class="anchor"><i class="ri-arrow-down-line"></i></a>
  </div>
</section>



<script src="https://cdn.jsdelivr.net/npm/typed.js@2.0.11/lib/typed.min.js"></script>

<div id="main">
  <section class="outer">
  <article class="articles">
    
    
    
    
    <article id="post-2020-02-26" class="article article-type-post" itemscope
  itemprop="blogPost" data-scroll-reveal>

  <div class="article-inner">
    
    <header class="article-header">
       
<h2 itemprop="name">
  <a class="article-title" href="/2020/02/26/2020-02-26/"
    >Pearson Correlation Coefficient</a
  >
</h2>
  

    </header>
    

    
    <div class="article-meta">
      <a href="/2020/02/26/2020-02-26/" class="article-date">
  <time datetime="2020-02-26T08:09:38.000Z" itemprop="datePublished">2020-02-26</time>
</a>
      
  <div class="article-category">
    <a class="article-category-link" href="/categories/%E5%AD%A6%E4%B9%A0%E7%AC%94%E8%AE%B0/">Study Notes</a>
  </div>

      
      
      
    </div>
    

    

    
    <div class="article-entry" itemprop="articleBody">
      


      

      
      <p><strong>Pearson correlation coefficient</strong></p>
<p><img src="../picture/pearson%E5%85%AC%E5%BC%8F.png" alt="avatar"></p>
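<p>Definition: the Pearson correlation coefficient ρ<sub>X,Y</sub> of two continuous variables X and Y equals their covariance cov(X,Y) divided by the product of their standard deviations σ<sub>X</sub> and σ<sub>Y</sub>. The coefficient always lies between -1.0 and 1.0. Variables with a value close to 0 are said to have no linear correlation, while a value close to 1 or -1 indicates a strong correlation.</p>
<p>Put simply, it measures how closely two data sets fall along a straight line, that is, whether they are linearly related, which is very useful in data analysis.</p>
<p>An implementation in Python 3:</p>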
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> math</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">pearson</span><span class="params">(vector1, vector2)</span>:</span></span><br><span class="line">    n = len(vector1)</span><br><span class="line">    <span class="comment"># simple sums</span></span><br><span class="line">    sum1 = sum(float(v) <span class="keyword">for</span> v <span class="keyword">in</span> vector1)</span><br><span class="line">    sum2 = sum(float(v) <span class="keyword">for</span> v <span class="keyword">in</span> vector2)</span><br><span class="line">    <span class="comment"># sums of the squares</span></span><br><span class="line">    sum1_pow = sum(pow(v, <span class="number">2.0</span>) <span class="keyword">for</span> v <span class="keyword">in</span> vector1)</span><br><span class="line">    sum2_pow = sum(pow(v, <span class="number">2.0</span>) <span class="keyword">for</span> v <span class="keyword">in</span> vector2)</span><br><span class="line">    <span class="comment"># sum of the products</span></span><br><span class="line">    p_sum = sum(x * y <span class="keyword">for</span> x, y <span class="keyword">in</span> zip(vector1, vector2))</span><br><span class="line">    <span class="comment"># numerator (num) and denominator (den)</span></span><br><span class="line">    num = p_sum - (sum1 * sum2 / n)</span><br><span class="line">    den = math.sqrt((sum1_pow - pow(sum1, <span class="number">2</span>) / n) * (sum2_pow - pow(sum2, <span class="number">2</span>) / n))</span><br><span class="line">    <span class="keyword">if</span> den == <span class="number">0</span>:</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0.0</span></span><br><span class="line">    <span class="keyword">return</span> num / den</span><br></pre></td></tr></table></figure>
<p>Take two sets of data:</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">vector1 &#x3D; [2, 7, 18, 88, 157, 90, 177, 570]</span><br><span class="line">vector2 &#x3D; [3, 5, 15, 90, 180, 88, 160, 580]</span><br><span class="line">print(&#39;result is: {:.12f}&#39;.format(pearson(vector1, vector2)))</span><br></pre></td></tr></table></figure>
<p>The result is about 0.998, so the two data sets are strongly positively correlated.</p>
<figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">result is: 0.998348748644</span><br></pre></td></tr></table></figure>
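<p>As a cross-check, the definition quoted earlier (covariance divided by the product of the standard deviations) can also be coded directly. This is only a minimal sketch, not part of the original implementation: the name <code>pearson_from_definition</code> is hypothetical, and it assumes population (divide-by-n) covariance and standard deviations, whose 1/n factors cancel in the ratio. It should agree with the sum-based function above and return +1 or -1 for points that lie exactly on a straight line.</p>

```python
import math


def pearson_from_definition(xs, ys):
    """Pearson r computed as cov(X, Y) / (sigma_X * sigma_Y).

    Hypothetical helper for illustration; uses population statistics.
    """
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations (divide by n).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        # Same convention as pearson() above: constant input yields 0.0.
        return 0.0
    return cov / (std_x * std_y)


vector1 = [2, 7, 18, 88, 157, 90, 177, 570]
vector2 = [3, 5, 15, 90, 180, 88, 160, 580]
r = pearson_from_definition(vector1, vector2)
print(round(r, 3))  # 0.998

# Points on one straight line give a coefficient of +1 or -1
# (up to floating-point rounding).
line_up = pearson_from_definition([1, 2, 3, 4], [3, 5, 7, 9])
line_down = pearson_from_definition([1, 2, 3, 4], [9, 7, 5, 3])
```

<p>Working from the means is easier to compare against the written definition, while the sum-based form above makes a single pass over the running totals. Both should give the same value up to floating-point rounding.</p>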

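<p>&emsp;&emsp;There is a widely told case from American retail: Walmart placed diapers and beer side by side on its shelves, and sales of both went up.<br>It turned out that American wives often asked their husbands to buy diapers for the baby on the way home from work, and the husbands would pick up a couple of bottles of beer while they were at it.<br>This shopping behavior meant that the two products were frequently bought together. The shelf placement was a decision made after data mining and trend analysis.</p>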
<hr>
<p>References: <a href="https://blog.csdn.net/AlexMerer/article/details/74908435" target="_blank" rel="noopener">The three major correlation coefficients in statistics: the Pearson correlation coefficient</a><br>&emsp;&emsp;&emsp;<a href="https://www.jianshu.com/p/a8349052a2a0" target="_blank" rel="noopener">Beer and diapers: what comes to mind?</a></p>

      
      <!-- reward -->
      
    </div>
    
    
      <!-- copyright -->
      
    <footer class="article-footer">
      
      
  <ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/%E5%A4%A7%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90/" rel="tag">Big Data Analysis</a></li></ul>


    </footer>

  </div>

  

  
  
  

  

</article>
    
    <article id="post-first_blog" class="article article-type-post" itemscope
  itemprop="blogPost" data-scroll-reveal>

  <div class="article-inner">
    
    <header class="article-header">
       
<h2 itemprop="name">
  <a class="article-title" href="/2020/02/19/first_blog/"
    >An Ordinary Day</a
  >
</h2>
  

    </header>
    

    
    <div class="article-meta">
      <a href="/2020/02/19/first_blog/" class="article-date">
  <time datetime="2020-02-19T10:19:11.000Z" itemprop="datePublished">2020-02-19</time>
</a>
      
  <div class="article-category">
    <a class="article-category-link" href="/categories/%E6%91%B8%E9%B1%BC%E9%9A%8F%E7%AC%94/">Slacking-Off Notes</a>
  </div>

      
      
      
    </div>
    

    

    
    <div class="article-entry" itemprop="articleBody">
      


      

      
      <h1 id="我的一天："><a href="#我的一天：" class="headerlink" title="My day:  "></a>My day:  </h1><hr>
<p>Morning: fixing bugs  </p>
<hr>
<p>Afternoon: <del><strong>slacking off</strong></del> remote training</p>
<hr>
<p>Evening: playing FF14  </p>
<hr>
<p><img src="../picture/first.jpg" alt="avatar"></p>
<hr>

      
      <!-- reward -->
      
    </div>
    
    
      <!-- copyright -->
      
    <footer class="article-footer">
      
      
  <ul class="article-tag-list" itemprop="keywords"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/%E7%AC%AC%E4%B8%80%E6%9D%A1%E5%8D%9A%E5%AE%A2/" rel="tag">First Blog Post</a></li></ul>


    </footer>

  </div>

  

  
  
  

  

</article>
    
  </article>
  

  
</section>
</div>

      <footer class="footer">
  <div class="outer">
    <ul class="list-inline">
      <li>
        &copy;
        2020
        Ryan Shu
      </li>
      <li>
        
          Powered by
        
        
        <a href="https://hexo.io" target="_blank">Hexo</a> Theme <a href="https://github.com/Shen-Yu/hexo-theme-ayer" target="_blank">Ayer</a>
        
      </li>
    </ul>
    <ul class="list-inline">
      <li>
        
        
        <span>
  <i>PV:<span id="busuanzi_value_page_pv"></span></i>
  <i>UV:<span id="busuanzi_value_site_uv"></span></i>
</span>
        
      </li>
      <li>
        <!-- cnzz analytics -->
        
      </li>
    </ul>
  </div>
</footer>
    <div class="to_top">
        <div class="totop" id="totop">
  <i class="ri-arrow-up-line"></i>
</div>
      </div>
    </main>
      <aside class="sidebar">
        <button class="navbar-toggle"></button>
<nav class="navbar">
  
  <div class="logo">
    <a href="/"><img src="/images/ayer-side.svg" alt="Sapphire"></a>
  </div>
  
  <ul class="nav nav-main">
    
    <li class="nav-item">
      <a class="nav-item-link" href="/">Home</a>
    </li>
    
    <li class="nav-item">
      <a class="nav-item-link" href="/archives">Archives</a>
    </li>
    
    <li class="nav-item">
      <a class="nav-item-link" href="/categories">Categories</a>
    </li>
    
    <li class="nav-item">
      <a class="nav-item-link" href="/tags">Tags</a>
    </li>
    
    <li class="nav-item">
      <a class="nav-item-link" href="/aboutme">About Me</a>
    </li>
    
  </ul>
</nav>
<nav class="navbar navbar-bottom">
  <ul class="nav">
    <li class="nav-item">
      
      <a class="nav-item-link nav-item-search"  title="Search">
        <i class="ri-search-line"></i>
      </a>
      
      
      <a class="nav-item-link" target="_blank" href="/atom.xml" title="RSS Feed">
        <i class="ri-rss-line"></i>
      </a>
      
    </li>
  </ul>
</nav>
<div class="search-form-wrap">
  <div class="local-search local-search-plugin">
  <input type="search" id="local-search-input" class="local-search-input" placeholder="Search...">
  <div id="local-search-result" class="local-search-result"></div>
</div>
</div>
      </aside>
      <div id="mask"></div>

<!-- #reward -->
<div id="reward">
  <span class="close"><i class="ri-close-line"></i></span>
  <p class="reward-p"><i class="ri-cup-line"></i>Buy me a coffee~</p>
  <div class="reward-box">
    
    <div class="reward-item">
      <img class="reward-img" src="/images/alipay.jpg">
      <span class="reward-type">Alipay</span>
    </div>
    
    
    <div class="reward-item">
      <img class="reward-img" src="/images/wechat.jpg">
      <span class="reward-type">WeChat</span>
    </div>
    
  </div>
</div>
      
<script src="/js/jquery-2.0.3.min.js"></script>


<script src="/js/jquery.justifiedGallery.min.js"></script>


<script src="/js/lazyload.min.js"></script>


<script src="/js/busuanzi-2.3.pure.min.js"></script>


<script src="/js/share.js"></script>



<script src="/fancybox/jquery.fancybox.min.js"></script>




<script>
  try {
    var typed = new Typed("#subtitle", {
    strings: ['A blog driven by interest','',''],
    startDelay: 0,
    typeSpeed: 200,
    loop: true,
    backSpeed: 100,
    showCursor: true
    });
  } catch (err) {
  }
  
</script>




<script>
  var ayerConfig = {
    mathjax: false
  }
</script>


<script src="/js/ayer.js"></script>


<script src="https://cdn.jsdelivr.net/npm/jquery-modal@0.9.2/jquery.modal.min.js"></script>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/jquery-modal@0.9.2/jquery.modal.min.css">





<script type="text/javascript" src="https://js.users.51.la/20544303.js"></script>
  </div>
</body>

</html>